Abstract

The Computer and Vision Research Center conducts a broad program of research
in computer vision, image processing, and architectures for image processing. During
the period of this report, several projects were completed, including those on
positioning and tracking of objects moving in space, parallel image processing, and
3-D representation and recognition. The results of five projects are briefly presented
in the report.

A.  Reconstruction  and  Matching  of  3-D  Objects  using  Quadtrees/Octrees. 

B.  3-D  Model  Construction  from  Multiple  views  Using  Range  and  Intensity  Data. 

C.  Parallel  algorithms  are  delineated  for  the  important  task  of  image  normalization. 

D. A versatile surface representation based upon the earlier volumetric description
developed at our Laboratory has been formulated.

E.  Interpretation  of  Structure  and  Motion  from  Line  Correspondences. 


During the period January 1, 1985 through December 31, 1985, our group made
20 presentations and published 9 papers in refereed journals, 11 in conference
proceedings, 1 technical report, and 1 non-refereed abstract. A complete listing of
these activities is provided at the end of this report.
A. RECONSTRUCTION AND MATCHING OF 3-D
OBJECTS USING QUADTREES/OCTREES

A.1. INTRODUCTION

Efficient 3-D object representations are crucial in computer vision, computer
graphics, computer-aided design and other related areas. Several representation
schemes have been proposed [1-4]. Representations are usually determined by the data
acquisition techniques or by the type of application. For instance, a surface
representation is suitable for graphic displays of opaque objects, whereas it is easier
to perform operations such as matching and interference analysis with volumetric
representations. A common problem with most representation techniques is that
requirements for memory and processing time grow as exponential or quadratic functions
of the input image size. This calls for a compact data structure on which efficient
algorithms can be implemented. The octree structure [5-11], with efficient tree
traversal algorithms, is such a candidate.

In general, an octree can be generated from a 3-D binary array using a recursive
subdivision procedure. However, the acquisition of such a volume description is not a
trivial problem. 3-D object structure can be derived from 2-D images, a task that has
been a primary concern of computer vision researchers. To resolve the 3-D
reconstruction problem, Chien and Aggarwal [10-11] proposed a scheme to generate
octrees from three orthogonal views of objects using a volume intersection technique
[12-13]. Each view is extended along the associated viewing direction to form a
cylinder. Each cylinder is described by a pseudo-octree. The octree of the object is
generated by intersecting the three pseudo-octrees.
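
The volume intersection idea can be illustrated with a small sketch. It is not the pseudo-octree algorithm of [10-11]: for brevity it intersects the three swept cylinders on a dense voxel array and only then builds the octree by recursive subdivision. The function names, the array layout, and the requirement of a cubic volume whose side is a power of two are assumptions made for this example.

```python
import numpy as np

def carve_from_orthogonal_views(sil_xy, sil_xz, sil_yz):
    """Intersect the three cylinders swept by orthogonal binary silhouettes.

    sil_xy[x, y] is the view along z, sil_xz[x, z] the view along y,
    and sil_yz[y, z] the view along x (assumed layout).
    """
    return sil_xy[:, :, None] & sil_xz[:, None, :] & sil_yz[None, :, :]

def build_octree(vol):
    """Return 'F' (full), 'E' (empty), or a list of eight child subtrees."""
    if vol.all():
        return "F"
    if not vol.any():
        return "E"
    h = vol.shape[0] // 2  # cubic volume, side a power of two (assumed)
    return [build_octree(vol[x:x + h, y:y + h, z:z + h])
            for x in (0, h) for y in (0, h) for z in (0, h)]

def count_leaves(node):
    """Number of leaf nodes in the octree."""
    if isinstance(node, str):
        return 1
    return sum(count_leaves(child) for child in node)
```

A box-shaped object is recovered exactly from its three silhouettes; objects with concavities generally are not, which is why refinement with additional views matters.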

In this research, this algorithm is extended to generate the 'generalized octree' of an
object from three known non-coplanar views, which are not necessarily orthogonal to each
other. Each unit volume (voxel) associated with a node in a generalized octree is a
parallelepiped, with the three sides specified by the three viewing directions.




A 'regular octree', with each voxel being a cube, is a special case of the generalized
octree structure. It is known that in some cases a finite number of views is not
enough to reconstruct the exact 3-D structure of an object. The more views of an object
that are given, the more accurate the description that can be obtained. An object
description scheme should therefore be conducive to refinement with additional
information. The proposed generalized octree structure allows subsequent refinement of
the representation as additional views become available. The basic principle of the
algorithm for refining octrees is similar to that of the algorithm for intersecting
pseudo-octrees.

To perform object matching, the representation of the objects should be location and
orientation invariant. The generalized octree structure does not meet these criteria,
since it is dependent on viewing directions. A common scheme to solve this problem is to
project a generalized octree onto the image planes of the three principal views (along
the principal axes) to obtain the three 'principal quadtrees', and to perform matching
based upon the principal quadtrees. Computing principal axes requires the computation
of the (3 x 3) moment of inertia matrix comprising second order moments. To speed up
processing, computation of these moments is performed in a 'generalized coordinate
system' specified by the three viewing directions. The coordinate transformation (from
the generalized coordinate system to the Cartesian coordinate system) is applied only
to the moment of inertia matrix. The three principal axes can be obtained from the
transformed moment of inertia matrix by computing its eigenvectors. A 'coarse' matching
is performed by matching the principal quadtrees of the unknown object against those of
a number of models. A smaller set of models with a lower degree of dissimilarity is
selected. The octrees of the observed object and models are generated and a 'fine'
matching is applied to octree pairs in order to identify the object. These results were
presented at the Third Workshop on Computer Vision: Representation and Control,
Bellaire, Michigan, October 13-16, 1985.
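
The saving obtained by transforming only the moment matrix can be sketched as follows. This is an illustrative example, not the report's implementation: it uses the central second-moment (scatter) matrix, which has the same eigenvectors as the moment of inertia matrix, and the names `second_moments`, `principal_axes`, and the matrix `A` (whose columns are assumed to hold the Cartesian edge vectors of one voxel) are hypothetical.

```python
import numpy as np

def second_moments(coords):
    """Central second-order moment (scatter) matrix of an (n, 3) point set."""
    centered = coords - coords.mean(axis=0)
    return centered.T @ centered

def principal_axes(voxels_g, A):
    """Principal axes of a voxel set given in generalized coordinates.

    voxels_g: (n, 3) voxel centres in the generalized coordinate system.
    A: 3x3 matrix mapping generalized to Cartesian coordinates (x = A @ u).
    """
    m_g = second_moments(voxels_g)   # accumulated without transforming any voxel
    m_c = A @ m_g @ A.T              # one 3x3 transformation instead of n
    eigvals, eigvecs = np.linalg.eigh(m_c)
    return eigvals, eigvecs          # columns of eigvecs are the principal axes
```

Because the moments are linear in the outer products of the coordinates, transforming the 3 x 3 matrix once gives the same result as transforming every voxel first.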
B. 3-D MODEL CONSTRUCTION FROM MULTIPLE VIEWS
USING RANGE AND INTENSITY DATA


B.1. Introduction

Automatic generation of computer models of the surfaces of arbitrarily shaped,
three-dimensional objects is an important problem in computer vision. In the past, a
number of different techniques have been used for representation and modeling of 3-D
objects for computer vision applications [1]-[7]. However, a fast and robust technique
for building 3-D models of arbitrarily shaped objects is still lacking. In this paper,
we describe a computationally efficient technique for automatic construction of 3-D
models of objects given multiple views of range and intensity data.

The process of constructing 3-D models of objects involves first integrating data or
structured descriptions from multiple views of an object and then generating a
representation of the complete object. In general, integrating data or structured
descriptions acquired from multiple views involves establishing correspondence between
the views and determining the appropriate interframe transformations to register the
views. The difficult and time consuming step in this process is the matching step
required to establish a correspondence. Much of the previous research effort has been
directed towards solving the difficult correspondence problem.

Several matching techniques have been developed in the past for solving this
correspondence problem. Potmesil [3] generates models of 3-D objects by spatially
matching 3-D surface segments describing the objects. His matching algorithm uses
heuristic search to align overlapping surface segments of an object into a common 3-D
coordinate system. Bhanu [1] has developed an interactive technique for constructing 3-D
models of objects. The model is constructed by rotating the object through a known angle
to acquire multiple views. Coordinates of points from the multiple views are then
expressed in one reference coordinate system, assuming that the interframe
transformations are known a priori. Ferri and Levine [4] discuss a technique for piecing
together the 3-D shape of moving objects. They construct the model of an object by first
computing descriptions of the visible surfaces of the object from each of a set of
multiple views. Then the interframe transformations which register these images with
respect to a reference coordinate system are computed (using a feature based matching
algorithm), thereby allowing for the reconstruction of surface descriptions in a world
coordinate system. Boyter and Aggarwal [5] present a technique for recognizing 3-D
objects using range and intensity data. They construct the model of an object by
rotating the object through known angles and collecting the range and intensity line
images for each object position. Bhanu et al. [2] describe a 3-D model building
technique that is based on CAGD (Computer Aided Geometric Design) techniques. The 3-D
data is obtained from a CAGD model of an object and the object is represented by planar
approximations. The planar approximations are merged using a spatial proximity graph to
obtain a structured collection of large faces. Magee et al. [6] present a technique for
recognition of 3-D objects through intensity guided range sensing. Models are
represented by a graph structure, wherein each node denotes a feature and arcs between
nodes depict the geometric relationships. Boyter and Aggarwal [7] present an algorithm
for recognition of polyhedra from range data. The polyhedral models are represented by
3-D coordinates of vertices, the plane equations of each face, and ordered lists of
vertices that bound the faces.

Most of the methods discussed above can be classified as correspondence based
methods. These methods are computationally expensive due to the large search space that
needs to be explored for establishing correspondence. It may be noted that none of the
methods discussed above utilizes information about the imaging geometry that is readily
available when constructing models. In this paper we present a technique for automatic
model construction, given the range and intensity data. The technique presents a simple
way of integrating information from multiple sensors, namely range and intensity.
Another important feature of our method is that no point correspondences are required to
determine the interframe transformations needed to express the points from each view in a
common reference coordinate system.

The range and intensity data are obtained using a commercially available laser
scanning system [8], which works on the principle of light sheet triangulation. The
object is placed in its stable position on a flat surface called the base plane. The
base plane is encoded with a pattern consisting of a single straight line. The object is
positioned on the base plane such that the base plane pattern is fully or partially
visible from every viewing angle. Multiple views of the object are generated by rotating
the base plane about some arbitrarily fixed axis perpendicular to and on it. By
observing the orientation of the base plane pattern in the intensity images of adjacent
views, the interframe transformation can be easily deduced. Once the interframe
transformation is known, all the (range) data are transformed into a reference
coordinate system and merged. A region description of the object may then be obtained
using the algorithm presented by Vemuri et al. [9]. In this representation, 3-D object
surfaces are represented by regions that are a collection of surface patches homogeneous
in certain intrinsic surface properties. An important aspect of this representation is
that it is viewpoint independent, which is crucial for object modeling and recognition.
The results were presented at the IEEE Computer Society Computer Vision and Pattern
Recognition Conference at Miami Beach, Florida, 1986.
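
As a rough illustration of how a single straight-line pattern can yield the interframe transformation without point correspondences, the sketch below compares the orientation of the line in two views and rotates the range points accordingly. It assumes that the line endpoints have already been expressed in base-plane coordinates and that the position of the rotation axis is known; the function names are hypothetical. An undirected line fixes the rotation only up to 180 degrees, so the views are assumed to be less than half a turn apart.

```python
import numpy as np

def line_angle(p, q):
    """Orientation of the line through p and q, folded into [0, pi)."""
    return np.arctan2(q[1] - p[1], q[0] - p[0]) % np.pi

def interframe_angle(ref_line, view_line):
    """Rotation (radians) taking the view's pattern onto the reference pattern."""
    return line_angle(*ref_line) - line_angle(*view_line)

def register(points, angle, axis_xy=(0.0, 0.0)):
    """Rotate (n, 3) points about the base-plane normal through axis_xy."""
    c, s = np.cos(angle), np.sin(angle)
    rot = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    pivot = np.array([axis_xy[0], axis_xy[1], 0.0])
    return (points - pivot) @ rot.T + pivot
```

Merging the views then amounts to applying `register` to the range points of each view and concatenating the results in the reference coordinate system.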
C. PARALLEL IMAGE NORMALIZATION


C.1. Introduction

It is becoming apparent that architectures for image processing utilize two
distinct types of processing elements. These form a two-level hierarchy in which
dedicated units perform the high speed low-level operations and more flexible
general purpose machines perform the high-level operations [1]. In particular, the
two-dimensional array structure has been shown to provide a high degree of
performance for low-level operations. Furthermore, the regular structure of such
arrays makes them suitable for VLSI implementation [2], [3].

Image  normalization  is  an  important  function  frequently  used  in  object 
recognition  tasks  [4], [5].  By  "normalizing"  the  image  of  an  object  we  refer  to  the 
process  of  creating  a  description  that  is  invariant  to  the  position,  orientation,  and  size 
of  the  object  in  the  image. 

A  mesh  structure  with  one  PE  per  pixel  matches  the  structure  of  the  image 
data  and  thus  the  normalization  task  essentially  requires  mapping  each  pixel  to  a  new 
pixel  location.  Then  the  process  is  reduced  to  routing  pixel  data  through  the  mesh. 

C.2.  Processing  Structure 

A four-neighbor connected mesh architecture is assumed, with one PE per pixel.
Each PE is capable of performing addition, multiplication, and comparison
operations. In addition, it can maintain a FIFO queue, which temporarily stores
routing information. The routing information, or "pixel-data", for each PE consists
of three fields: pixel value, destination address, and adjacent boundary pixels.


C.3.  Parallel  Normalization 


Basically, image normalization maps each object pixel from its original position
to a destination determined by the normalization parameters: translation vector,
rotation angle, and scale factor. The interior of an object has to be filled after
the mapping when the scale factor is greater than 1 or the rotation angle is not an
integer multiple of 90°. Before filling the interior, we have to reconnect the
disconnected boundary. The overall normalization process therefore consists of
calculation of destination, mapping, boundary reconnection, and filling.

Let $g(T, \theta, s)$ be the mapping function, where $T$ is the translation vector,
$\theta$ the rotation angle, and $s$ the scale factor. If we use $g$ directly when
mapping, the boundary reconnection is quite complex and time consuming. To overcome
this difficulty, we decompose $g$ into three subfunctions: $g_1(T)$, $g_2(\theta)$,
and $g_3(s)$. The three processes, translation, rotation, and scaling, are therefore
performed separately with $g_1$, $g_2$, and $g_3$, respectively. Boundary
reconnection is then unnecessary for the first two processes, in which the boundary
is not disconnected. Moreover, the geometrical relationship between neighbors is not
changed by $g_3(s)$. In other words, the direction in which a pixel's neighbor will
be found after the mapping is the same as that before the mapping. Therefore, for
reconnection we only have to store which of the 4 neighbors of a boundary pixel are
themselves boundary pixels.

For  each  process  (translation,  rotation,  scaling),  the  following  procedures 
are  executed  in  all  PE’s  in  parallel. 

(a)  Calculation  of  destination  address 

(b)  Mapping  of  non-background  pixels 

(c) Reconnection of boundary (scaling, s > 1)

(d)  Filling  the  inside  of  object  (rotation,  scaling) 
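
Step (a) can be sketched for the three separate passes as below. This is an illustrative example only: pixel addresses are taken as (row, column) pairs, the rotation and scaling centre is an assumed parameter, and rounding to the nearest pixel is an assumption about how destinations are quantized.

```python
import numpy as np

def translate_dest(rc, t):
    """Destination addresses after translation by the integer vector t."""
    return rc + np.asarray(t)

def rotate_dest(rc, theta, centre):
    """Destination addresses after rotation by theta about centre."""
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -s], [s, c]])
    return np.rint((rc - centre) @ rot.T + centre).astype(int)

def scale_dest(rc, s, centre):
    """Destination addresses after scaling by s about centre."""
    return np.rint((rc - centre) * s + centre).astype(int)
```

Scaling two horizontally adjacent pixels by a factor of 2 sends them to destinations two columns apart: a gap opens, but the neighbor is still found in the same direction, which is the property the reconnection step relies on.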
Mapping


The basic control scheme of mapping is the "store-and-forward" mechanism. The
mapping of a pixel is controlled by the repetitive application of basic flow
control. We distinguish between two types of controls for regulating the flow of
data between PE's. The first is the common flow control (CFC) illustrated in Figure
C.1. All PE's transfer data to the same neighbors simultaneously. The second control
mechanism is the discriminate flow control (DFC), where the array is partitioned into
disjoint sets. Each set implements one of the forms of common flow control. Some
examples which are used in scaling and rotation are shown in Figures C.2 and C.3. A
set of flow controls applied one at a time is called a cycle and each element of the
cycle is called a phase. In a particular phase, if a PE contains data at the head of
the queue to be transmitted in the direction specified by the phase, it transfers the
data.
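
The store-and-forward scheme under common flow control can be simulated sequentially as in the sketch below. It is a simplified model, not the report's simulator: a cycle is assumed to consist of four CFC phases (east, west, south, north), each packet carries only a value and a destination address, and discriminate flow control is not modeled. All names are hypothetical.

```python
from collections import deque

# One cycle = four common-flow-control phases (assumed order).
PHASES = {"E": (0, 1), "W": (0, -1), "S": (1, 0), "N": (-1, 0)}

def wants(direction, r, c, dr, dc):
    """True if a packet at (r, c) bound for (dr, dc) must move in `direction`."""
    return {"E": dc > c, "W": dc < c, "S": dr > r, "N": dr < r}[direction]

def route(n, packets):
    """Route packets (value, src_r, src_c, dst_r, dst_c) on an n x n mesh.

    Returns the delivered values keyed by destination and the cycle count.
    """
    queues = {(r, c): deque() for r in range(n) for c in range(n)}
    out = {}
    for v, r, c, dr, dc in packets:
        if (r, c) == (dr, dc):
            out[(dr, dc)] = v
        else:
            queues[(r, c)].append((v, dr, dc))
    cycles = 0
    while any(queues.values()):
        cycles += 1
        for d, (sr, sc) in PHASES.items():
            # All PEs act simultaneously: collect transfers, then apply them.
            moves = []
            for (r, c), q in queues.items():
                if q and wants(d, r, c, q[0][1], q[0][2]):
                    moves.append(((r + sr, c + sc), q.popleft()))
            for (r, c), pkt in moves:
                if (r, c) == (pkt[1], pkt[2]):
                    out[(r, c)] = pkt[0]      # arrived at its destination PE
                else:
                    queues[(r, c)].append(pkt)
    return out, cycles
```

Only the packet at the head of each FIFO queue is examined in a phase, so a PE forwards at most one packet per phase, as in the description above.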

Boundary  Reconnection 

Local information about the direction of adjacent boundary pixels is available
in each PE which holds a boundary pixel after mapping. If an adjacent pixel in one
of those directions is not a boundary pixel, it is made a boundary pixel by changing
its pixel value and transferring to it the information about the adjacent boundary
pixels. This operation is repeated (through a cycle of CFC's) until a boundary pixel
is met. The same operation is similarly initiated, if necessary, in other
directions. In this way the disconnected boundary is eventually reconnected.

Filling 

For non-boundary and non-background pixels, any adjacent (4-connectivity)
non-boundary pixel is made an object pixel. This operation is repeated
(through a cycle of CFC's) until the inside of the object has been filled.
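
The filling rule can be emulated with whole-array shifts, each iteration standing in for one cycle in which every object pixel converts its 4-connected non-boundary neighbors. This is a sequential sketch under assumed pixel labels (`BACKGROUND`, `OBJECT`, `BOUNDARY`); it presumes a closed boundary and at least one object pixel already mapped inside it.

```python
import numpy as np

BACKGROUND, OBJECT, BOUNDARY = 0, 1, 2  # assumed labels

def fill_interior(img, max_cycles=None):
    """Grow object pixels into adjacent background pixels until stable.

    Returns the filled image and the number of growth cycles used.
    """
    img = img.copy()
    max_cycles = max_cycles or img.size
    for cycles in range(max_cycles):
        obj = img == OBJECT
        grow = np.zeros_like(obj)
        grow[1:, :] |= obj[:-1, :]    # neighbor above is an object pixel
        grow[:-1, :] |= obj[1:, :]    # neighbor below
        grow[:, 1:] |= obj[:, :-1]    # neighbor to the left
        grow[:, :-1] |= obj[:, 1:]    # neighbor to the right
        new = grow & (img == BACKGROUND)
        if not new.any():
            return img, cycles
        img[new] = OBJECT
    return img, max_cycles
```

Boundary pixels are never overwritten, so the growth cannot leak outside a closed boundary.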


C.4.  Discussion 

Figure C.4 contains simulation results for the image of an airplane. The proposed
method produces correctly normalized images, with only a small quantization error
along the boundary of the normalized image. We can consider some variations of this
method. Suppose that we do not have enough PE's and therefore have to assign a block
of the image to each PE. Then each PE sequentially processes the block of the image
assigned to it and exchanges pixel data with neighboring PE's. The method may also be
extended to gray level images as follows. A gray level object is segmented into
several regions, each of which is uniform (same gray level). Each region may be
considered as a binary object and normalized using the proposed algorithm. The
possible gaps between regions are handled by the boundary reconnection and region
filling algorithms.

In translation, routing pixels toward their destinations takes at most $N/2$
cycles, where the image size is $N \times N$. In rotation and scaling, the routing
requires $O(N)$ cycles. It also takes $O(N)$ cycles to reconnect the boundary or to
fill the object. Therefore, the overall normalization needs $O(N)$ steps, compared
to $O(N^2)$ for a sequential method.


Figure C.1: Common Flow Control

Figure C.2: Discriminate Flow Control

Figure C.3: Discriminate Flow Control for Rotation by Multiples of 90 Degrees

Figure C.4: a) Original image  b) Rotation  c) Scaling 1.12  d) Scaling 0.7
e) Scaling 0.7 and Rotation -30°  f) Rotation -30° and Scaling 0.7
g) Rotation -*5° and Scaling 0.7
D. CONSTRUCTION OF SURFACE REPRESENTATION
FROM 3-D VOLUMETRIC SCENE DESCRIPTION

D.1. Introduction

This  research  is  aimed  at  developing  a  versatile  surface  representation  of 
3-D  objects  from  the  volumetric  scene  description  developed  by  Martin  and  Aggarwal 
[1],  [2].  The  technique  we  propose  builds  an  explicit  surface  representation  from  a 
general  description  of  a  scene  containing  several  occluding  objects.  The  scene 
description  is  obtained  by  integrating  information  from  several  2-D  images,  and  is 
recorded  as  a  hierarchical  data  structure  which  represents  a  set  of  planar  slices  of  the 
object;  each  slice  is  characterized  by  a  collection  of  2-D  shapes  which  define  the 
structure  at  that  cross  section.  A  bottom-up  approach  to  surface  construction  is 
adopted  here.  This  approach  involves  three  steps:  (1)  Contours  on  pairs  of  consecutive 
slices  are  examined.  Contours  are  associated  on  the  basis  of  the  amount  of 
overlapping  between  regions  enclosed  by  these  contours;  (2)  Surface  elements  are  fit  in 
between  pairs  of  associated  contours  to  establish  local  surface  structure;  and  (3)  These 
surface  elements  are  then  coalesced  to  form  larger  object  facets.  The  resulting  surface 
structure  is  recorded  in  a  table  of  polygonal  patches  that  forms  the  bounding  surface 
description  of  the  3-D  objects  in  the  scene.  Each  step  will  be  described  in  more  detail 
in  the  following  paragraphs.  Some  implementation  results  are  also  presented. 

D.2.  Contour  Association 

We first state the criteria for associating contours. Two contours will be
associated if they are on a pair of consecutive slices, are of the same sign (either
both contours enclose object regions or both enclose hole regions), and the
overlapping area of the two regions enclosed by the two contours is significant
compared to that of the regions themselves. The 3-D scene structure is processed
sequentially two slices at a time. If the overlapping area is close to that of both
regions enclosed by the two contours, both contours will be marked as processed by
assigning to them the channel number to which they belong for identification. If the
overlap is small compared to the area of the larger of the two regions and large
compared to that of the smaller, the larger contour can still be associated with
other contours and hence is marked only after all association relations are found.
After the associations are completed, contours left unmarked possibly belong to
different objects in the scene and are assigned new channel numbers. In this process,
sequences of associated contours are recorded by attaching a unique tag (channel
number). The subsequent triangulation and surface hierarchy construction are
performed over each channel independently.
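
A minimal sketch of the association test and channel tagging follows. The report does not state numeric thresholds, so the overlap threshold `tau`, the representation of each enclosed region as a non-empty boolean mask with a sign, and the rule that a contour inherits the channel of the first associated contour on the previous slice are all assumptions made for illustration.

```python
import numpy as np

def overlap_ratios(a, b):
    """Overlap area as a fraction of each region's own area."""
    inter = np.logical_and(a, b).sum()
    return inter / a.sum(), inter / b.sum()

def associate(regions_prev, regions_cur, tau=0.5):
    """Pairs (i, j) of same-sign regions whose overlap is significant
    relative to the smaller region. Each region is a (mask, sign) pair."""
    pairs = []
    for i, (mask_a, sign_a) in enumerate(regions_prev):
        for j, (mask_b, sign_b) in enumerate(regions_cur):
            if sign_a == sign_b and max(overlap_ratios(mask_a, mask_b)) >= tau:
                pairs.append((i, j))
    return pairs

def assign_channels(slices, tau=0.5):
    """Process slices two at a time and tag every contour with a channel."""
    next_channel = 0
    channels = []
    for k, regions in enumerate(slices):
        labels = [None] * len(regions)
        if k > 0:
            for i, j in associate(slices[k - 1], regions, tau):
                if labels[j] is None:
                    labels[j] = channels[k - 1][i]
        for j in range(len(regions)):
            if labels[j] is None:      # unassociated: start a new channel
                labels[j] = next_channel
                next_channel += 1
        channels.append(labels)
    return channels
```

Using the larger of the two ratios lets one large contour be associated with several smaller ones on the next slice, which corresponds to the branching case described above.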

D.3.  Local  Surface  Construction 

After the association relationships between contours on consecutive slices are
identified, we construct the bounding surface structure between pairs of associated
contours. Here we prefer planar facets to curved ones because of their
representational simplicity. Finding a planar surface approximation between pairs of
associated contours can be formulated as a triangulation process. Briefly, the
triangulation process generates a collection of triangular patches between the
associated pairs of contours such that their union forms a closed bounding surface.
The vertices of the triangles are the boundary points on a pair of associated
contours. Triangulation of boundary points can be accomplished by constructing a
graph (or matrix) representation in which the row and column indices correspond to
the sequence numbers of the contour points on a pair of associated boundaries. A
closed surface representation can then be formed by selecting a proper set of
triangles such that the corresponding edges form a connected path of length m+n in
the graph, where m and n are the lengths of the previous and current contours,
respectively. Our method of finding such a path (or a triangulation) is based on the
observation that merit assignments among neighboring triangles are correlated. This
observation suggests that the merit assignment of a triangle should go through an
iterative updating process, or 'relaxation' process, that incorporates the merits of
the neighboring triangles before the final merit assignment is ascertained.

The relaxation process serves as an early screening process which speeds up the
final structure construction by reducing the dimension of the selection through an
early pruning of the triangles with relatively low merits. However, we might be left
with many promising triangles, so that more than one bounding surface structure can
be built from them. A final selection of the bounding surface representation is thus
needed, using a search over the restricted graph of possibilities. The well known A*
search algorithm is repeated for all pairs of associated contours to produce the
local bounding surface description.
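
The path formulation can be made concrete with a simplified stand-in. The sketch below does not implement the relaxation and A* procedure described above; it substitutes a plain dynamic program that minimizes the total length of the edges spanning the two contours, and it assumes that the starting points `P[0]` and `Q[0]` already correspond. The function name and the triangle encoding are hypothetical.

```python
import numpy as np

def triangulate(P, Q):
    """Triangulate the band between closed contours P (m, 3) and Q (n, 3).

    Each step of the path from node (0, 0) to node (m, n) advances along
    one contour and emits one triangle, so the path has m + n steps.
    """
    m, n = len(P), len(Q)

    def span(i, j):
        # Length of the spanning edge between P[i] and Q[j] (indices wrap).
        return np.linalg.norm(P[i % m] - Q[j % n])

    cost = np.full((m + 1, n + 1), np.inf)
    cost[0, 0] = 0.0
    back = {}
    for i in range(m + 1):
        for j in range(n + 1):
            if i > 0 and cost[i - 1, j] + span(i, j) < cost[i, j]:
                cost[i, j] = cost[i - 1, j] + span(i, j)
                back[(i, j)] = (i - 1, j)
            if j > 0 and cost[i, j - 1] + span(i, j) < cost[i, j]:
                cost[i, j] = cost[i, j - 1] + span(i, j)
                back[(i, j)] = (i, j - 1)
    path = [(m, n)]
    while path[-1] != (0, 0):
        path.append(back[path[-1]])
    path.reverse()
    triangles = []
    for (i0, j0), (i1, j1) in zip(path, path[1:]):
        if i1 > i0:   # advanced along P: two P vertices and one Q vertex
            triangles.append((("P", i0 % m), ("P", i1 % m), ("Q", j0 % n)))
        else:         # advanced along Q: two Q vertices and one P vertex
            triangles.append((("Q", j0 % n), ("Q", j1 % n), ("P", i0 % m)))
    return triangles
```

Every contour edge is used by exactly one triangle, so the result always contains m + n triangles regardless of which cost function selects the path.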

D.4.  Surface  Hierarchy  Generation 

Instead of using the triangular patches directly as primitives for representation,
we establish a surface hierarchy by coalescing triangular patches with identical
orientation into polygonal facets. Because the basic triangular patches are numerous,
they may not constitute a reasonable depiction of the 3-D objects. A more reasonable
depiction can be achieved if we establish the surface hierarchy by coalescing the
adjacent triangular patches into polygonal facets such that the orientation of the
constituent triangular patches is preserved. Data reduction is also achieved through
this process.

The surface hierarchy can be established by coalescing adjacent patches in two
directions: horizontal and vertical. Horizontal merging coalesces adjacent patches
with the same orientation within the same cross section, whereas vertical merging
coalesces patches resulting from the horizontal merging across the whole scene
structure. The structure obtained from this merging is shown in Figure D.1. Each
polygonal facet is delineated through a pointer set which defines the bounding points
for the facet at each slice; this pointer set enables us to retrieve the detailed
structure of each facet for later analysis. Information about the channel number,
normal direction, and the size of the patch is also recorded.
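
The coalescing step can be sketched as grouping edge-adjacent triangles whose unit normals agree. The example below does not separate horizontal and vertical merging and does not build the pointer sets; it only shows the grouping itself, using a small union-find structure. The tolerance `tol`, the consistent winding of the triangles, and the function names are assumptions.

```python
import numpy as np
from collections import defaultdict

def unit_normal(tri, verts):
    """Unit normal of a triangle given as three vertex indices."""
    a, b, c = (verts[i] for i in tri)
    n = np.cross(b - a, c - a)
    return n / np.linalg.norm(n)

def coalesce(verts, tris, tol=1e-6):
    """Group edge-adjacent triangles with identical orientation into facets.

    Returns a sorted list of facets, each a list of triangle indices.
    """
    parent = list(range(len(tris)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    normals = [unit_normal(t, verts) for t in tris]
    edge_owner = defaultdict(list)
    for k, t in enumerate(tris):
        for e in ((t[0], t[1]), (t[1], t[2]), (t[2], t[0])):
            edge_owner[frozenset(e)].append(k)
    for owners in edge_owner.values():
        first = owners[0]
        for other in owners[1:]:
            if np.dot(normals[first], normals[other]) > 1.0 - tol:
                parent[find(other)] = find(first)
    facets = defaultdict(list)
    for k in range(len(tris)):
        facets[find(k)].append(k)
    return sorted(facets.values())
```

For a unit square split into two coplanar triangles plus a third triangle in a perpendicular plane that shares an edge with them, the result is two facets: the square and the lone triangle.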


Figure D.1. The final patch structure


D.5. Experimental Results

Some experimental results are shown below. Figures D.2.1, D.3.1, and D.4.1 show the
wire frame 3-D structure of a bus, an object with a hole, and a scene with multiple
objects, respectively. Figures D.2.2, D.3.2, and D.4.2 show the surface structures
constructed for Figures D.2.1, D.3.1, and D.4.1, respectively, as viewed from
different angles.
E. INTERPRETATION OF STRUCTURE AND MOTION
FROM LINE CORRESPONDENCES


E.1. INTRODUCTION

The problem we address in this research is, in its generality, that of recovering the
orientation and position of a set of lines in space from multiple views of these lines,
as well as the relative displacement between the views. Research in structure and motion
from images has concentrated on the use of points and optical flow. There is also a
growing interest in the use of contours and range. The use of lines [1]-[3] has been
relatively neglected, although lines may often be easier to extract from images. In [1]
a method has been proposed which relies on the projective configuration of six lines in
three views. In this paper we describe a method based on the observation of four lines
in three views (the case of two views of any number of lines is inherently ambiguous).
This method exploits the principle of invariance of angular configuration with respect
to rigid motion, in addition to the usual projective constraints. The method first
solves for the orientation of lines in space. The rotational component of motion
between the viewing systems is then readily recovered from these orientations. Finally,
the translation components of motion (and therefore the position of lines in space) can
be recovered. The results will be presented at the International Conference on Pattern
Recognition in Paris in October 1986.
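
Once the line orientations are known in two viewing systems, the rotation between the systems can be recovered in closed form. The sketch below uses a standard least-squares alignment via the singular value decomposition, which is a substitute chosen for illustration and not necessarily the procedure used in this research. It assumes unit direction vectors with consistent signs in both views; `rotation_from_directions` is a hypothetical name.

```python
import numpy as np

def rotation_from_directions(D1, D2):
    """Least-squares rotation R such that D2[i] is close to R @ D1[i].

    D1, D2: (k, 3) arrays of corresponding unit line directions.
    """
    H = D1.T @ D2
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))   # guard against reflections
    return Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
```

The invariance of angular configuration appears here as the equality of the pairwise dot products `D1 @ D1.T` and `D2 @ D2.T` whenever the two sets of directions are related by a rigid rotation.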